Collaborator

morgan-wowk commented Feb 2, 2026

Pipeline Metrics Instrumentation

Tracks pipeline lifecycle from creation to completion with labels for status and user.

Metrics added:

  • pipeline_runs_total: Counter tracking pipeline runs by status (running/succeeded/failed/cancelled) and created_by
  • pipeline_run_duration_seconds: Histogram tracking total pipeline duration by final status

These metrics provide visibility into pipeline success rates, completion times, and usage patterns per user.
Durations measure total lifecycle time from creation to terminal state (including queue and execution time).
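
To make the label and duration semantics above concrete, here is a minimal sketch. It is not the PR's code: it uses plain in-memory stand-ins instead of a real metrics client, and the bucket bounds, the helper names `record_status` and `record_duration`, and the user `alice` are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Hypothetical bucket upper bounds in seconds; the PR does not list its buckets.
DURATION_BUCKETS = (60, 300, 1800, 3600, float("inf"))
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}

# In-memory stand-ins for a metrics client's counter and histogram.
pipeline_runs_total = Counter()            # key: (status, created_by)
pipeline_run_duration_seconds = Counter()  # key: (status, bucket upper bound)


def record_status(status: str, created_by: str) -> None:
    """Count one pipeline run entering the given status."""
    pipeline_runs_total[(status, created_by)] += 1


def record_duration(status: str, created_at: datetime, ended_at: datetime) -> float:
    """Observe creation-to-terminal-state time (queue + execution) in seconds."""
    if status not in TERMINAL_STATUSES:
        raise ValueError(f"{status!r} is not a terminal status")
    seconds = (ended_at - created_at).total_seconds()
    # Buckets are cumulative (Prometheus style): an observation counts toward
    # every bucket whose upper bound is >= the observed value.
    for upper_bound in DURATION_BUCKETS:
        if seconds <= upper_bound:
            pipeline_run_duration_seconds[(status, upper_bound)] += 1
    return seconds


created = datetime(2026, 2, 2, 12, 0, tzinfo=timezone.utc)
record_status("running", "alice")
record_status("succeeded", "alice")
elapsed = record_duration("succeeded", created, created + timedelta(minutes=4))
print(elapsed)  # 240.0
```

In this sketch the duration is labelled only with the final status, matching the description, while `created_by` appears only on the run counter.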

Contributor

Ark-kun commented Feb 6, 2026

I'm not fully sure we should be doing this on the app side.
The backend can be restarted at any time (e.g. when a new version is deployed). With the current implementation in the PR, this seems to affect the metrics, but I think it shouldn't.

I'm not sure about reporting pipeline run times like this. Tracking execution durations might be more OK.

I'm not sure the backend should be aggregating a run's execution status statistics into a single status. This single status will likely not be useful for many users. I think it's better if users can get the full status stats and the UI can provide derivatives.
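
One possible reading of these concerns, sketched under assumptions: execution durations are derived from persisted timestamps rather than in-process timers, so a backend restart does not change them, and a run exposes its full per-status counts instead of one aggregated status. The `Execution` record and both helper functions are hypothetical and do not come from the PR.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Execution:
    """Hypothetical persisted record of one execution within a pipeline run."""
    status: str
    started_at: Optional[datetime] = None
    ended_at: Optional[datetime] = None


def status_stats(executions: list[Execution]) -> dict[str, int]:
    """Return the full per-status counts instead of one aggregated status."""
    return dict(Counter(e.status for e in executions))


def execution_durations(executions: list[Execution]) -> list[float]:
    """Compute durations in seconds from stored timestamps only, so a restart
    between start and end does not lose or distort the measurement."""
    return [
        (e.ended_at - e.started_at).total_seconds()
        for e in executions
        if e.started_at is not None and e.ended_at is not None
    ]


t0 = datetime(2026, 2, 6, 9, 0, tzinfo=timezone.utc)
run = [
    Execution("succeeded", t0, t0 + timedelta(seconds=90)),
    Execution("failed", t0, t0 + timedelta(seconds=30)),
    Execution("running", t0),  # still in progress, so it has no duration yet
]
stats = status_stats(run)
durations = execution_durations(run)
print(stats)      # {'succeeded': 1, 'failed': 1, 'running': 1}
print(durations)  # [90.0, 30.0]
```

A UI could then derive whatever summary it needs (for example "1 of 3 failed") from `stats` without the backend committing to a single status.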

@morgan-wowk
Collaborator Author

> I'm not fully sure we should be doing this on the app side. The backend can be restarted at any time (e.g. when a new version is deployed). With the current implementation in the PR, this seems to affect the metrics, but I think it shouldn't.
>
> I'm not sure about reporting pipeline run times like this. Tracking execution durations might be more OK.
>
> I'm not sure the backend should be aggregating a run's execution status statistics into a single status. This single status will likely not be useful for many users. I think it's better if users can get the full status stats and the UI can provide derivatives.

Thanks for jumping in to these draft PRs and giving some early feedback. Especially this one. After all our discussion, that does sound like the best thing to do.
